Hyperparameter Transfer Learning through Surrogate Alignment for Efficient Deep Neural Network Training
Authors
Abstract
Recently, several optimization methods have been successfully applied to the hyperparameter optimization of deep neural networks (DNNs). These methods work by modeling the joint distribution of hyperparameter values and the corresponding error. They become less practical when applied to modern DNNs whose training may take several days, since one cannot collect enough observations to model the distribution accurately. To address this challenging issue, we propose a method that learns to transfer optimal hyperparameter values from a small source dataset to hyperparameter values with comparable performance on a dataset of interest. In contrast to existing transfer learning methods, our method does not rely on hand-designed features. Instead, it uses surrogates to model the hyperparameter-error distributions of the two datasets and trains a neural network to learn the transfer function. Extensive experiments on three computer-vision benchmark datasets clearly demonstrate the efficiency of our method.
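The core idea — fit cheap surrogates to the hyperparameter-error observations of each dataset, then use the source surrogate's optimum to guide the target search — can be illustrated with a minimal numpy sketch. This is not the paper's learned transfer network: the toy error functions, the quadratic surrogates, and the search-window size are all assumptions made for illustration.

```python
import numpy as np

# Toy validation-error curves over one hyperparameter (e.g. log learning
# rate). The target dataset's optimum is shifted relative to the source's.
def source_error(lr):      # minimum at lr = -3.0 (illustrative)
    return (lr + 3.0) ** 2

def target_error(lr):      # minimum shifted to lr = -2.0 (illustrative)
    return (lr + 2.0) ** 2

# Fit cheap quadratic surrogates to a few observed (hyperparameter, error)
# pairs; the source is cheap to evaluate, the target is expensive.
xs = np.linspace(-5.0, -1.0, 8)     # plentiful source observations
xt = np.array([-4.5, -3.0, -1.5])   # only three expensive target observations
surr_s = np.poly1d(np.polyfit(xs, source_error(xs), 2))
surr_t = np.poly1d(np.polyfit(xt, target_error(xt), 2))

grid = np.linspace(-5.0, -1.0, 401)
src_opt = grid[np.argmin(surr_s(grid))]       # source surrogate's optimum

# Transfer step (simplified): search the target surrogate only in a small
# window around the source optimum instead of the whole space.
window = grid[np.abs(grid - src_opt) <= 1.5]
tgt_opt = window[np.argmin(surr_t(window))]   # transferred target optimum
```

Here the transfer is just a warm-started local search; the paper instead trains a neural network on samples from the two surrogates to learn the mapping end to end.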
Similar Resources
Efficient Hyperparameter Optimization for Deep Learning Algorithms Using Deterministic RBF Surrogates
Automatically searching for optimal hyperparameter configurations is of crucial importance for applying deep learning algorithms in practice. Recently, Bayesian optimization has been proposed for optimizing hyperparameters of various machine learning algorithms. Those methods adopt probabilistic surrogate models like Gaussian processes to approximate and minimize the validation error function o...
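The deterministic RBF-surrogate idea described in this snippet — interpolate the observed (hyperparameter, error) pairs with radial basis functions and minimize the cheap interpolant instead of the expensive error function — can be sketched in a few lines of numpy. The toy error function, the Gaussian kernel, and the width parameter are assumptions for illustration, not the cited paper's exact setup.

```python
import numpy as np

def val_error(h):                      # pretend-expensive validation error
    return np.sin(3.0 * h) + h ** 2    # minimum near h ≈ -0.45 on [-2, 2]

obs_h = np.linspace(-2.0, 2.0, 9)      # nine expensive evaluations
obs_e = val_error(obs_h)

eps = 1.0                              # RBF width (assumed)

def rbf_weights(x, y):
    # Gaussian kernel matrix between all observation pairs; solving the
    # linear system gives exact interpolation weights at the observations.
    K = np.exp(-(eps * (x[:, None] - x[None, :])) ** 2)
    return np.linalg.solve(K, y)

w = rbf_weights(obs_h, obs_e)

def surrogate(h):
    # Evaluate the interpolant at new hyperparameter values.
    K = np.exp(-(eps * (np.asarray(h)[:, None] - obs_h[None, :])) ** 2)
    return K @ w

# Minimizing the surrogate on a dense grid is nearly free, whereas each
# evaluation of val_error would be a full training run in practice.
grid = np.linspace(-2.0, 2.0, 801)
best = grid[np.argmin(surrogate(grid))]
```

The surrogate reproduces the observations exactly and its minimizer lands close to the true optimum, at the cost of nine expensive evaluations rather than hundreds.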
apsis - Framework for Automated Optimization of Machine Learning Hyper Parameters
Machine learning and the algorithms used for it have become more and more complex in the past years. Especially the growth of Deep Learning architectures has resulted in a large number of hyperparameters such as the number of hidden layers or the transfer function in a neural network which have to be tuned to achieve the best possible performance. Since the result of a hyperparameter choice can...
Multiple deep convolutional neural networks averaging for face alignment
Face alignment is critical for face recognition, and the deep learning-based method shows promise for solving such issues, given that competitive results are achieved on benchmarks with additional benefits, such as dispensing with handcrafted features and initial shape. However, most existing deep learning-based approaches are complicated and quite time-consuming during training. We propose a c...
Bayesian Neural Networks for Predicting Learning Curves
The performance of deep neural networks (DNNs) crucially relies on good hyperparameter settings. Since the computational expense of training DNNs renders traditional blackbox optimization infeasible, recent advances in Bayesian optimization model the performance of iterative methods as a function of time to adaptively allocate more resources to promising hyperparameter settings. Here, we propos...
Combination of Hyperband and Bayesian Optimization for Hyperparameter Optimization in Deep Learning
Deep learning has achieved impressive results on many problems. However, it requires a high degree of expertise or a great deal of experience to tune the hyperparameters well, and such a manual tuning process is likely to be biased. Moreover, it is not practical to try out as many different hyperparameter configurations in deep learning as in other machine learning scenarios, because evaluating each singl...
Journal: CoRR
Volume: abs/1608.00218
Pages: -
Publication date: 2016